Kullback–Leibler divergence

In probability theory and information theory, the Kullback–Leibler divergence〔Kullback S. (1959), ''Information Theory and Statistics'' (John Wiley & Sons).〕 (also information divergence, information gain, relative entropy, KLIC, or KL divergence) is a measure of the difference between two probability distributions ''P'' and ''Q''. It is not symmetric in ''P'' and ''Q''. In applications, ''P'' typically represents the "true" distribution of data, observations, or a precisely calculated theoretical distribution, while ''Q'' typically represents a theory, model, description, or approximation of ''P''.
Specifically, the Kullback–Leibler divergence of ''Q'' from ''P'', denoted ''D''KL(''P''‖''Q''), is a measure of the information gained when one revises one's beliefs from the prior probability distribution ''Q'' to the posterior probability distribution ''P''. More exactly, it is the amount of information that is ''lost'' when ''Q'' is used to approximate ''P'',〔Burnham K.P., Anderson D.R. (2002), ''Model Selection and Multi-Model Inference'', 2nd edition (Springer), p. 51.〕 defined operationally as the expected extra number of bits required to code samples from ''P'' using a code optimized for ''Q'' rather than the code optimized for ''P''.
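To make the coding interpretation concrete, note the standard identity (added here for illustration, not drawn from the original article) that the expected code length under a code optimized for ''Q'' is the cross entropy ''H''(''P'',''Q''), while the optimal expected code length is the entropy ''H''(''P''), so that
: D_{\mathrm{KL}}(P\|Q) = \sum_i P(i)\,\log\frac{1}{Q(i)} - \sum_i P(i)\,\log\frac{1}{P(i)} = H(P,Q) - H(P).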
Although it is often intuited as a way of measuring the distance between probability distributions, the Kullback–Leibler divergence is not a true metric. It does not obey the triangle inequality, and in general ''D''KL(''P''‖''Q'') does not equal ''D''KL(''Q''‖''P''). However, its infinitesimal form, specifically its Hessian, gives a metric tensor known as the Fisher information metric.
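As a small worked illustration of this asymmetry (the numbers are chosen here for the example and are not from the original text), take ''P'' = (1/2, 1/2) and ''Q'' = (9/10, 1/10) on a two-element space. Measured in bits,
: D_{\mathrm{KL}}(P\|Q) = \tfrac{1}{2}\log_2\frac{1/2}{9/10} + \tfrac{1}{2}\log_2\frac{1/2}{1/10} \approx 0.737,
: D_{\mathrm{KL}}(Q\|P) = \tfrac{9}{10}\log_2\frac{9/10}{1/2} + \tfrac{1}{10}\log_2\frac{1/10}{1/2} \approx 0.531,
so the two directions indeed disagree.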
The Kullback–Leibler divergence is a special case of a broader class of divergences called ''f''-divergences, which in turn are a special case of Bregman divergences.
The Kullback–Leibler divergence was originally introduced by Solomon Kullback and Richard Leibler in 1951 as the directed divergence between two distributions. It is discussed in Kullback's historic text, ''Information Theory and Statistics''.
The Kullback–Leibler divergence is sometimes also called the information gain achieved if ''P'' is used instead of ''Q''. It is also called the relative entropy of ''P'' with respect to ''Q'', and written ''H(P|Q)''.
==Definition==
For discrete probability distributions ''P'' and ''Q'',
the Kullback–Leibler divergence of ''Q'' from ''P'' is defined to be
: D_{\mathrm{KL}}(P\|Q) = \sum_i P(i) \, \log\frac{P(i)}{Q(i)}.
In words, it is the expectation of the logarithmic difference between the probabilities ''P'' and ''Q'', where the expectation is taken using the probabilities ''P''. The Kullback–Leibler divergence is defined only if ''Q''(''i'')=0 implies ''P''(''i'')=0, for all ''i'' (absolute continuity). Whenever ''P''(''i'') is zero the contribution of the ''i''-th term is interpreted as zero because \lim_{x \to 0^{+}} x \log(x) = 0.
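The discrete definition translates directly into a short computation. The sketch below (illustrative only; Python and the name kl_divergence are choices made here, not part of the article) applies the two conventions just stated: a term with ''P''(''i'') = 0 contributes nothing, while a positive ''P''(''i'') paired with ''Q''(''i'') = 0 violates absolute continuity and the divergence is reported as infinite.
 import math
 
 def kl_divergence(p, q, base=2.0):
     """D_KL(P || Q) for two discrete distributions given as equal-length
     sequences of probabilities (base 2 gives the result in bits)."""
     total = 0.0
     for p_i, q_i in zip(p, q):
         if p_i == 0.0:
             continue              # lim_{x -> 0+} x log x = 0
         if q_i == 0.0:
             return math.inf       # absolute continuity fails
         total += p_i * math.log(p_i / q_i, base)
     return total
 
 # Reproduces the two-coin example above (values in bits).
 print(kl_divergence([0.5, 0.5], [0.9, 0.1]))   # ~0.737
 print(kl_divergence([0.9, 0.1], [0.5, 0.5]))   # ~0.531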
For distributions ''P'' and ''Q'' of a continuous random variable, the Kullback–Leibler divergence is defined to be the integral:〔Bishop C. (2006). ''Pattern Recognition and Machine Learning'' p. 55.〕
: D_{\mathrm{KL}}(P\|Q) = \int_{-\infty}^\infty p(x) \, \log\frac{p(x)}{q(x)} \, dx,
where ''p'' and ''q'' denote the densities of ''P'' and ''Q''.
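As a quick numerical check of the continuous form, the sketch below (assuming NumPy; the Gaussian parameters are arbitrary values chosen for the illustration) evaluates the integral of p(x) \log\frac{p(x)}{q(x)} on a grid for two normal densities and compares it with the well-known closed form for two univariate Gaussians, both in nats.
 import numpy as np
 
 # Two Gaussian densities: P = N(mu1, s1^2), Q = N(mu2, s2^2).
 mu1, s1 = 0.0, 1.0
 mu2, s2 = 1.0, 2.0
 
 x = np.linspace(-20.0, 20.0, 200001)
 p = np.exp(-0.5 * ((x - mu1) / s1) ** 2) / (s1 * np.sqrt(2 * np.pi))
 q = np.exp(-0.5 * ((x - mu2) / s2) ** 2) / (s2 * np.sqrt(2 * np.pi))
 
 # Grid approximation of the integral of p(x) log(p(x)/q(x)) dx, in nats.
 numeric = np.trapz(p * np.log(p / q), x)
 
 # Known closed form for two univariate Gaussians, also in nats.
 closed = np.log(s2 / s1) + (s1**2 + (mu1 - mu2)**2) / (2 * s2**2) - 0.5
 
 print(numeric, closed)   # both approximately 0.4431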
More generally, if ''P'' and ''Q'' are probability measures over a set ''X'', and ''P'' is absolutely continuous with respect to ''Q'', then the Kullback–Leibler divergence of ''Q'' from ''P'' is defined as
: D_{\mathrm{KL}}(P\|Q) = \int_X \log\!\left(\frac{dP}{dQ}\right) \frac{dP}{dQ} \, dQ,
which we recognize as the entropy of ''P'' relative to ''Q''. Continuing in this case, if \mu is any measure on ''X'' for which p = \frac{dP}{d\mu} and q = \frac{dQ}{d\mu} exist (meaning that ''P'' and ''Q'' are absolutely continuous with respect to \mu), then the Kullback–Leibler divergence of ''Q'' from ''P'' is given as
: D_{\mathrm{KL}}(P\|Q) = \int_X p \, \log\frac{p}{q} \, d\mu.
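As a consistency check (a remark added here, not taken from the original text): choosing \mu = Q gives p = \frac{dP}{dQ} and q = 1, so the integrand reduces to \frac{dP}{dQ}\,\log\frac{dP}{dQ} and the previous expression is recovered, while choosing \mu to be Lebesgue measure on the real line recovers the density form given earlier.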
The logarithms in these formulae are taken to base 2 if information is measured in units of bits, or to base ''e'' if information is measured in nats. Most formulas involving the Kullback–Leibler divergence hold regardless of the base of the logarithm.
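Concretely, a value computed with natural logarithms (nats) converts to bits on division by \ln 2; this is a standard unit conversion noted here for convenience:
: D_{\mathrm{KL}}^{\mathrm{bits}} = D_{\mathrm{KL}}^{\mathrm{nats}} / \ln 2 \approx D_{\mathrm{KL}}^{\mathrm{nats}} / 0.693.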
Various conventions exist for referring to ''D''KL(''P''‖''Q'') in words. It is often described as the divergence ''between'' ''P'' and ''Q'', but this phrasing fails to convey the fundamental asymmetry of the relation. It is sometimes also described as the divergence of ''P'' from, or with respect to, ''Q'' (often in the context of relative entropy or information gain). In this article, however, ''D''KL(''P''‖''Q'') is referred to as the divergence of ''Q'' from ''P'', since this best reflects the idea that ''P'' is the underlying "true" or "best guess" distribution, with respect to which expectations are calculated, while ''Q'' is a divergent, less accurate approximation.
